Linguistically enriched corpora for establishing variation in support verb constructions

Author

  • Begoña Villada Moirón
Abstract

Many NLP tasks that require syntactic analysis necessitate an accurate description of the lexical components, morpho-syntactic constraints and semantic idiosyncrasies of fixed expressions. Moon (1998) and Riehemann (2001) show that many fixed expressions and idioms allow limited variation and modification inside their complementation. This paper discusses to what extent a corpus-based method can help us establish the variation and adjectival modification potential of Dutch support verb constructions. We also discuss the problems that the data pose when an automated data-driven method is applied to this task.

Similar resources

Cross-Lingual Variation of Light Verb Constructions: Using Parallel Corpora and Automatic Alignment for Linguistic Research

Cross-lingual parallelism and small-scale language variation have recently become a subject of research in both computational and theoretical linguistics. In this article, we use a parallel corpus and an automatic aligner to study English light verb constructions and their German translations. We show that parallel corpus data can provide new empirical evidence for better understanding the proper...

Discarding Noise in an Automatically Acquired Lexicon of Support verb Constructions

We applied data-driven methods to carry out automatic acquisition of Dutch prepositional support verb constructions (SVCs) in corpora (e.g., iets in de gaten houden (“keep an eye on something”)). This paper addresses the question whether linguistic diagnostics help to discard noise from the n-best lists and how to (semi-)automatically apply such linguistic diagnostics to parsed corpora. We show t...

Linguistic Evaluation of Support Verb Constructions by OpenLogos and Google Translate

This paper presents a systematic human evaluation of translations of English support verb constructions produced by a rule-based machine translation (RBMT) system (OpenLogos) and a statistical machine translation (SMT) system (Google Translate) for five languages: French, German, Italian, Portuguese and Spanish. We classify support verb constructions by means of their syntactic structure and se...

Nouns as Components of Support Verb Constructions in the Prague Dependency Treebank

Support Verb Constructions (SVCs) are combinations of a noun denoting an event or a state and a lexical verb. From the semantic point of view, the noun seems to be a part of a complex predicate rather than the object (or subject) of the verb, despite what the surface syntax suggests. The meaning is concentrated in the noun component, whereas the semantic content of the verb is reduced or genera...

ConFarm: Extracting Surface Representations of Verb and Noun Constructions from Dependency Annotated Corpora of Russian

ConFarm is a web service dedicated to the extraction of surface representations of verb and noun constructions from dependency-annotated corpora of Russian texts. Currently, the extraction of constructions with a specific lemma from SynTagRus and the Russian National Corpus is available. The system provides a flexible interface that allows users to fine-tune the output. Extracted constructions are grouped...

Publication date: 2005